Automatic Publication Data

Authors

  • Nguyen Thi
  • Hoang Oanh
  • Kan Min Yen
Abstract

In many universities it would be useful to have a database of publications that reflects the research results of the academic staff. Such a database can be built by automatically retrieving publication information from faculty members' homepages. In this project, we deploy focused crawling to build such a system. We also propose a new focused crawling heuristic based on URL classification. We compare the performance of our proposed method with breadth-first crawling and a variant of context-focused crawling. Experimental results show that our new heuristic finds target pages faster and avoids irrelevant pages better, outperforming the other crawling methods.

Subject Descriptors: H.3.1 Content Analysis and Indexing; H.3.3 Information Search and Retrieval; I.2.7 Natural Language Processing; I.2.8 Problem Solving, Control Methods, and Search
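The contrast the abstract draws between breadth-first crawling and a URL-driven focused crawl can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy link graph `LINKS`, the `uni.example` URLs, and the keyword weights in `KEYWORDS`, which merely stand in for a trained URL classifier. The crawl runs over an in-memory graph, so no network access is involved.

```python
import heapq

# Hypothetical toy link graph: page URL -> outgoing links (not from the paper).
LINKS = {
    "http://uni.example/": ["http://uni.example/news", "http://uni.example/staff"],
    "http://uni.example/news": ["http://uni.example/news/1", "http://uni.example/news/2"],
    "http://uni.example/staff": ["http://uni.example/~kan/"],
    "http://uni.example/~kan/": ["http://uni.example/~kan/publications.html"],
}
SEED = "http://uni.example/"
TARGET = "http://uni.example/~kan/publications.html"

# Assumed keyword weights standing in for a real URL classifier.
KEYWORDS = {"publication": 3.0, "~": 2.0, "staff": 1.0}


def url_score(url):
    """Score a URL by summing the weights of the keywords it contains."""
    return sum(weight for key, weight in KEYWORDS.items() if key in url.lower())


def crawl(seed, target, focused):
    """Return the pages fetched, in order, until the target is reached.

    With focused=False every link gets the same priority, so the insertion
    counter makes the heap behave like a FIFO queue (breadth-first order).
    With focused=True, links with higher URL scores are fetched first.
    """
    frontier = [(0.0, 0, seed)]  # (negated score, insertion order, url)
    seen = {seed}
    fetched = []
    counter = 1
    while frontier:
        _, _, url = heapq.heappop(frontier)
        fetched.append(url)
        if url == target:
            break
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                priority = -url_score(link) if focused else 0.0
                heapq.heappush(frontier, (priority, counter, link))
                counter += 1
    return fetched


if __name__ == "__main__":
    print("breadth-first:", len(crawl(SEED, TARGET, focused=False)), "pages fetched")
    print("focused:      ", len(crawl(SEED, TARGET, focused=True)), "pages fetched")
```

On this toy graph the breadth-first order fetches 7 pages before reaching the publications page, while the URL-scored order fetches 4. The gap comes entirely from the hand-picked weights and graph, so it says nothing about the results reported in the paper.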


Similar resources

Metadata Enrichment for Automatic Data Entry Based on Relational Data Models

The automatic generation of data entry forms from relational data models is a well-known idea that is being discussed more and more, given the popularity of agile methods in software development and the accompanying development of programming tools. One of the requirements of the automation methods, whether in commercial products or the relevant research projects, ...


Predicting of Students' Anxiety on the basis of Emotional Regulation Difficulties and Negative Automatic Thoughts

Introduction: Anxiety is a psychological disorder whose causes are essential to understand. The aim of this study was to examine emotional regulation difficulties and negative automatic thoughts in the prediction of anxiety among students of Islamic Azad University, Bukan Branch. Methods: The method used is descriptive-correlational. The statistical population of this study includes all of college...


AN-EUL method for automatic interpretation of potential field data in unexploded ordnances (UXO) detection

We have applied an automatic interpretation method for potential field data, called AN-EUL, to unexploded ordnance (UXO) prospecting. The method combines the analytic signal and Euler deconvolution approaches. It can be applied to both magnetic and gravity data, as well as to gradient surveys, based upon the concept of the structural index (SI) of a potential anomaly which is relate...


A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. Sifting through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...


Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition

In this paper, the use of clustering algorithms for data fusion at the decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...


Publication Ethics: A Case Series with Recommendations According to Committee on Publication Ethics (COPE)

Ethical misconduct is not a new issue in the history of science and literature. However, ethical misconduct in science has grown considerably in the modern era, owing to the emphasis on scientific proliferation in research institutes and the gauging of scientists according to their publications. In the current case series, several cases of misconduct occurring over the previous years in Mashhad Univer...



Publication date: 2005